Skip to content

feat: competitive review pipeline improvements#33

Merged
haasonsaas merged 8 commits intomainfrom
feat/phase1-triage-context-parsing
Mar 11, 2026
Merged

feat: competitive review pipeline improvements#33
haasonsaas merged 8 commits intomainfrom
feat/phase1-triage-context-parsing

Conversation

@haasonsaas
Copy link
Collaborator

@haasonsaas haasonsaas commented Mar 11, 2026

Summary

10 files changed, +2485 lines, 114 new tests (716 total, all passing).

Test plan

  • All 716 tests pass (cargo test)
  • Triage correctly skips lock files, generated files, deletion/whitespace/comment-only diffs (45 tests)
  • Function boundary detection works across Rust, Python, JS, Go, Java (5 tests)
  • All 5 LLM response parsing strategies produce correct output (11 tests)
  • Verification auto-zeros noise, parses scoring format, clamps scores (17 tests)
  • ModelRole falls back to primary when role-specific model not configured (12 tests)
  • Compression selects correct strategy at each stage, respects max_calls, accounts for all file indices (24 tests)
  • Manual: run a review on a large PR to verify compression logging and skipped-file summary

🤖 Generated with Claude Code


Open with Devin

haasonsaas and others added 3 commits March 11, 2026 08:25
…sing

Phase 1 improvements to the review pipeline, informed by competitive analysis
of Greptile, CodeRabbit, and Qodo Merge architectures:

- File triage (#29): Heuristic classifier skips lock files, generated files,
  deletion-only, whitespace-only, and comment-only changes before expensive
  LLM review. 45 unit tests.

- Dynamic context boundaries (#25): Context fetcher now searches upward for
  enclosing function/class definitions instead of using fixed-window context.
  Uses asymmetric padding (more before, less after) matching Qodo's approach.
  6 unit tests.

- Robust LLM output parsing (#28): 5-strategy fallback chain for parsing LLM
  responses — primary format, numbered lists, markdown bullets, file:line
  format, and JSON extraction. Prevents silent comment loss when models deviate
  from the expected format. 11 unit tests.

- Fix rust-embed v8 derive macro (Embed -> RustEmbed)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Verification pass (#23): LLM validates review findings against actual code,
  scoring 0-10 with auto-zero for noise (docstrings, type hints, imports).
  Filters hallucinated comments before output. 17 new tests.
- Multi-model routing (#26): ModelRole enum (Primary/Weak/Reasoning/Embedding)
  with per-role model config and fallback chain. 12 new tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
4-stage progressive degradation strategy for diffs exceeding token budget:
  Stage 1 (Full): all diffs fit → use as-is
  Stage 2 (Compressed): remove deletion-only hunks, drop tail files
  Stage 3 (Clipped): truncate large files at hunk boundaries
  Stage 4 (MultiCall): split into multiple batches (up to 5)

Integrated into review pipeline — files dropped by compression are
skipped with logging. 24 new tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Prevent compression analysis from skipping files in per-file review mode, make verification diff truncation UTF-8 safe, and tighten comment-only triage to avoid dereference false positives while improving compression packing order.

Made-with: Cursor
devin-ai-integration[bot]

This comment was marked as resolved.

haasonsaas and others added 4 commits March 11, 2026 10:56
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of failing the workflow, emit a notice and skip the review
and comment-posting steps when OPENAI_API_KEY is not set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ding default

- `#` comment prefix: exclude Rust attributes (#[...], #![...]), C
  preprocessor directives (#include, #define, etc.) from comment detection
- `generated/` match: use boundary-aware checks (starts_with or /generated/)
  to avoid false positives on paths like `user_generated/`
- Embedding model default: fall back to primary model instead of hardcoding
  OpenAI's text-embedding-3-small, respecting self-hosted provider support
- 4 new tests for the triage fixes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…errors

Use the Weak model role for the verification pass — it's a validation
task that doesn't need the frontier model. This also resolves the
clippy dead_code errors for ModelRole, model_for_role, and
to_model_config_for_role by giving them their first real callsite.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 4 new potential issues.

View 12 additional findings in Devin Review.

Open in Devin Review

Comment on lines +89 to +95
// 3. Deletion-only — all non-context lines are Removed
if all_changes
.iter()
.all(|c| matches!(c.change_type, ChangeType::Removed))
{
return TriageResult::SkipDeletionOnly;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Triage SkipDeletionOnly silently skips review of security-relevant code removals

The new triage system classifies any diff where all non-context changes are Removed as SkipDeletionOnly, causing these files to be silently skipped from LLM review (src/review/pipeline.rs:311-320). This means removal of security checks, validation logic, error handling, authentication guards, or rate-limiting code would never be reviewed.

Example of dangerous deletion-only diff that would be skipped

If a developer removes an authentication check:

- if !user.is_authorized() {
-     return Err("Unauthorized");
- }

This is a deletion-only change. The triage at triage.rs:90-94 classifies it as SkipDeletionOnly, and the pipeline at pipeline.rs:312-320 skips the file entirely — no LLM review occurs for this security regression.

Deletion-only changes are a well-known category of security-relevant refactors. Unlike the pre-existing diff.is_deleted check (which skips fully deleted files — a reasonable heuristic), SkipDeletionOnly applies to files that still exist but had code removed, which is exactly the pattern for accidentally removing safeguards.

Prompt for agents
In src/review/triage.rs, remove the SkipDeletionOnly classification entirely, or make it configurable. The deletion-only check at lines 89-95 in triage_diff() should be removed because deletion-only changes can represent security-relevant code removals (e.g., removing authentication checks, validation, error handling). If kept at all, it should be gated behind a config flag like triage_skip_deletions: bool (default false). Also update the TriageResult enum, should_skip(), and reason() to remove or gate the variant. In src/review/pipeline.rs around line 311-320, the triage skip logic is fine as-is — it just needs the underlying triage heuristic to be less aggressive.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

];

/// Comment line prefixes (after trimming leading whitespace).
const COMMENT_PREFIXES: &[&str] = &["//", "# ", "/*", "*/", "* ", "--", "<!--", "\"\"\"", "'''"];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 COMMENT_PREFIXES includes "* " which false-positives on Markdown list items and other non-comment content

The is_comment_line function at src/review/triage.rs:176-193 uses COMMENT_PREFIXES to detect comment-only changes, but it's file-type-agnostic. The prefix "* " (intended for C/Java/Rust doc-comment continuation lines like * @param x) also matches Markdown list items (* Fix the build process). A diff that only modifies Markdown list items, horizontal rules (--), or headings (# ) in .md/.sql/other files would be classified as SkipCommentOnly and silently skipped from review. For example, changes to README, CHANGELOG, or API documentation that only modify list items would never be reviewed.

Prompt for agents
In src/review/triage.rs, the is_comment_line function (line 176) should either: (1) accept the file extension as a parameter so it can apply language-specific comment prefix lists, or (2) the triage_diff function should exclude known non-code file extensions (.md, .txt, .rst, .adoc) from the SkipCommentOnly classification. The COMMENT_PREFIXES list at line 44 is correct for code files but causes false positives for Markdown/documentation files where '* ' is a list item, '# ' is a heading, and '--' is a horizontal rule.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +33 to +38
if [ -z "${{ secrets.OPENAI_API_KEY }}" ]; then
echo "skip=true" >> "$GITHUB_OUTPUT"
echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured"
else
echo "skip=false" >> "$GITHUB_OUTPUT"
fi

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 GitHub Actions secret directly interpolated into shell command allows injection

At line 33, ${{ secrets.OPENAI_API_KEY }} is expanded by GitHub Actions' template engine before the shell runs, meaning the raw secret value is pasted directly into the if [ -z "..." ] command. If the secret contains shell-special characters (e.g., ", `, $(...), ;, |), it can break the script or, worse, execute arbitrary commands.

Safe pattern using environment variable

The standard fix is to pass the secret through an environment variable:

- name: Check API key
  id: check_key
  env:
    KEY: ${{ secrets.OPENAI_API_KEY }}
  run: |
    if [ -z "$KEY" ]; then
      echo "skip=true" >> "$GITHUB_OUTPUT"
    else
      echo "skip=false" >> "$GITHUB_OUTPUT"
    fi

This way the value is never interpolated into the script text itself.

Suggested change
if [ -z "${{ secrets.OPENAI_API_KEY }}" ]; then
echo "skip=true" >> "$GITHUB_OUTPUT"
echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured"
else
echo "skip=false" >> "$GITHUB_OUTPUT"
fi
if [ -z "$OPENAI_KEY" ]; then
echo "skip=true" >> "$GITHUB_OUTPUT"
echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured"
else
echo "skip=false" >> "$GITHUB_OUTPUT"
fi
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +288 to +298
pub fn find_enclosing_boundary_line(
content: &str,
file_path: &Path,
line: usize,
_max_search_lines: usize,
) -> Option<usize> {
let language = detect_language(file_path);
let boundaries = detect_function_boundaries(content, &language);
// Find the innermost function containing this line
find_enclosing_function(&boundaries, line).map(|b| b.start_line)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 find_enclosing_boundary_line ignores max_search_lines parameter, violating its documented contract

The function's docstring at src/core/function_chunker.rs:285-287 states it "Searches upward from line up to max_search_lines", but the implementation at line 292 prefixes the parameter with _ and completely ignores it. Instead it scans all function boundaries in the entire file. The caller at src/core/context.rs:56-57 passes 10 and then applies its own post-hoc filter (boundary >= start.saturating_sub(10)) to compensate, but if anyone else calls find_enclosing_boundary_line trusting the documented parameter semantics, the limit won't be respected.

Prompt for agents
In src/core/function_chunker.rs at line 292, the _max_search_lines parameter is unused. Either implement the documented behavior (filter boundaries to only those within max_search_lines above the given line), or remove the parameter from the function signature and update the docstring. If you keep it, the implementation should filter like: find_enclosing_function(&boundaries, line).filter(|b| line - b.start_line <= max_search_lines).map(|b| b.start_line).
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@haasonsaas haasonsaas merged commit 1b62f94 into main Mar 11, 2026
6 checks passed
@haasonsaas haasonsaas deleted the feat/phase1-triage-context-parsing branch March 11, 2026 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant